Fine-To-Coarse Global Registration of RGB-D Scans
RGB-D scanning of indoor environments is important for many applications,
including real estate, interior design, and virtual reality. However, it is
still challenging to register RGB-D images from a hand-held camera over a long
video sequence into a globally consistent 3D model. Current methods often can
lose tracking or drift and thus fail to reconstruct salient structures in large
environments (e.g., parallel walls in different rooms). To address this
problem, we propose a "fine-to-coarse" global registration algorithm that
leverages robust registrations at finer scales to seed detection and
enforcement of new correspondence and structural constraints at coarser scales.
To test global registration algorithms, we provide a benchmark with 10,401
manually-clicked point correspondences in 25 scenes from the SUN3D dataset.
During experiments with this benchmark, we find that our fine-to-coarse
algorithm registers long RGB-D sequences better than previous methods.
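The abstract describes registration proceeding from fine scales (short, reliable local alignments) to coarse scales, with new structural constraints detected at each level. The loop below is a toy sketch of that hierarchy, not the paper's method: the alignment and constraint-detection steps are stubbed out, and `fine_to_coarse_register` and its window-doubling schedule are illustrative assumptions.

```python
def fine_to_coarse_register(frames, levels=3):
    """Toy fine-to-coarse loop: align within small windows first, then
    merge windows and accumulate coarser-scale constraints each level.
    The real algorithm would solve a global optimization per level."""
    # Finest scale: each frame is its own window.
    windows = [[f] for f in frames]
    constraints = []
    for level in range(levels):
        # 1) Align frames within each window (stubbed: assumed robust here).
        # 2) Detect structural constraints among the registered geometry,
        #    e.g. coplanar or parallel walls (stubbed: record the level).
        constraints.append(("level", level, len(windows)))
        # 3) Double the window size for the next, coarser level.
        windows = [sum(windows[i:i + 2], []) for i in range(0, len(windows), 2)]
    return windows, constraints


# Usage: eight placeholder "frames" collapse into one global window.
windows, constraints = fine_to_coarse_register(list(range(8)), levels=3)
```

The point of the structure is that constraints found at coarser levels are seeded by alignments already trusted at finer levels, so drift is corrected globally rather than frame-to-frame.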
Neural Illumination: Lighting Prediction for Indoor Environments
This paper addresses the task of estimating the light arriving from all
directions to a 3D point observed at a selected pixel in an RGB image. This
task is challenging because it requires predicting a mapping from a partial
scene observation by a camera to a complete illumination map for a selected
position, which depends on the 3D location of the selection, the distribution
of unobserved light sources, the occlusions caused by scene geometry, etc.
Previous methods attempt to learn this complex mapping directly using a single
black-box neural network, which often fails to estimate high-frequency lighting
details for scenes with complicated 3D geometry. Instead, we propose "Neural
Illumination", a new approach that decomposes illumination prediction into
several simpler differentiable sub-tasks: 1) geometry estimation, 2) scene
completion, and 3) LDR-to-HDR estimation. The advantage of this approach is
that the sub-tasks are relatively easy to learn and can be trained with direct
supervision, while the whole pipeline is fully differentiable and can be
fine-tuned with end-to-end supervision. Experiments show that our approach
performs significantly better quantitatively and qualitatively than prior work.
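The key design point is that each sub-task is a simple, separately supervisable module while the composed pipeline stays differentiable end to end. The sketch below only illustrates that composition; the stage functions are hypothetical stand-ins (placeholder math, not learned networks), and the warping step implied by the geometry estimate is stubbed.

```python
import numpy as np

def estimate_geometry(rgb):
    """Sub-task 1 stand-in: per-pixel 3D geometry (depth/normals)."""
    return np.ones_like(rgb)

def complete_scene(partial_obs):
    """Sub-task 2 stand-in: fill in unobserved regions of the map."""
    return np.clip(partial_obs, 0.0, 1.0)

def ldr_to_hdr(ldr_env_map):
    """Sub-task 3 stand-in: toy inverse-gamma as an LDR-to-HDR proxy."""
    return ldr_env_map ** 2.2

def neural_illumination(rgb, pixel):
    """Compose the sub-tasks into one differentiable pipeline for the
    illumination map at the 3D point behind `pixel` (warping stubbed)."""
    geom = estimate_geometry(rgb)
    partial = geom * rgb          # observed pixels "warped" via geometry
    ldr = complete_scene(partial)
    return ldr_to_hdr(ldr)


# Usage: a flat gray image yields a uniform placeholder environment map.
env_map = neural_illumination(np.full((4, 8, 3), 0.5), pixel=(1, 2))
```

Because every stage is a differentiable function, gradients from an end-to-end illumination loss can flow back through all three modules during fine-tuning, while each module can also be pre-trained with its own direct supervision.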
LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop
While there has been remarkable progress in the performance of visual
recognition algorithms, the state-of-the-art models tend to be exceptionally
data-hungry. Large labeled training datasets, expensive and tedious to produce,
are required to optimize millions of parameters in deep network models. Lagging
behind the growth in model capacity, the available datasets are quickly
becoming outdated in terms of size and density. To circumvent this bottleneck,
we propose to amplify human effort through a partially automated labeling
scheme, leveraging deep learning with humans in the loop. Starting from a large
set of candidate images for each category, we iteratively sample a subset, ask
people to label them, classify the others with a trained model, split the set
into positives, negatives, and unlabeled based on the classification
confidence, and then iterate with the unlabeled set. To assess the
effectiveness of this cascading procedure and enable further progress in visual
recognition research, we construct a new image dataset, LSUN. It contains
around one million labeled images for each of 10 scene categories and 20 object
categories. We experiment with training popular convolutional networks and find
that they achieve substantial performance gains when trained on this dataset.
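The cascading procedure the abstract describes (sample, label by humans, classify the rest, split by confidence, iterate on the ambiguous middle) can be sketched as a short loop. This is an illustrative toy, not the LSUN implementation: the oracle, classifier, thresholds, and sample size are all assumptions.

```python
def amplify_labels(candidates, oracle, train_and_score,
                   hi=0.9, lo=0.1, sample_size=100, rounds=3):
    """Partially automated labeling: humans label a sample each round,
    a model trained on those labels scores the rest, and only the
    low-confidence middle band is carried into the next round."""
    positives, negatives = [], []
    unlabeled = list(candidates)
    for _ in range(rounds):
        if not unlabeled:
            break
        # 1) Sample a subset and ask people (the oracle) to label it.
        sample, rest = unlabeled[:sample_size], unlabeled[sample_size:]
        labeled = [(x, oracle(x)) for x in sample]
        positives += [x for x, y in labeled if y]
        negatives += [x for x, y in labeled if not y]
        # 2) Train on the human labels and score the remaining items.
        score = train_and_score(labeled)
        scored = [(x, score(x)) for x in rest]
        # 3) Split by confidence; the ambiguous band stays unlabeled.
        positives += [x for x, s in scored if s >= hi]
        negatives += [x for x, s in scored if s <= lo]
        unlabeled = [x for x, s in scored if lo < s < hi]
    return positives, negatives, unlabeled


# Usage with a toy task: items >= 100 are "positive", and the trained
# model (a stand-in) happens to score them perfectly.
pos, neg, rem = amplify_labels(
    candidates=list(range(200)),
    oracle=lambda x: x >= 100,
    train_and_score=lambda labeled: (lambda x: 1.0 if x >= 100 else 0.0),
)
```

The amplification comes from step 3: every high-confidence model prediction is a label obtained without human effort, so human clicks are spent only on the items the model is unsure about.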